The accumulation of mismanaged plastic waste in the environment is a growing global concern. Knowing precisely where litter is generated is important for targeting priority areas for new mitigation policies. In this project, we used country-level data on waste management, combined with world population distributions and long-term projections of population and gross domestic product (GDP), to investigate whether GDP is the sole driver of mismanaged waste per person per day. This is important because mismanaged waste is causing hundreds of thousands of people to die each year in the developing world from easily preventable causes, and plastic waste especially is adding a new and dangerous dimension to the problem.
For this project we used three data sets from an article on the Our World in Data site, combined with Gapminder world census data. The article explores the long-term impact of mismanaged litter and its chemical, ecological, behavioral, physical, and health consequences.
This data was made available by Our World in Data, an online publication that focuses on large-scale global issues such as poverty, inequality, war, disease, and climate change. The data was gathered by researchers at Oxford University and included 9 variables: country name, country code (or abbreviation), year, per capita GDP, per capita plastic waste, total mismanaged waste, per capita mismanaged waste, coastal population, and total population. This data was then merged with Gapminder data, which contained mapping information that allowed us to visualize plastic waste on a global scale.
Each data set had 186 observations for each variable. Note that data is not provided for some countries because of practical or logistical challenges in gathering data in these areas. These countries include, but are not limited to, Bolivia, Kazakhstan, Paraguay, Mongolia, Afghanistan, and some central African countries.
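The merge step can be pictured as a join on country code. The sketch below is a minimal illustration in plain Python, not the code used in the project. The field names (`code`, `mismanaged_pc`, `continent`) and the sample rows are hypothetical placeholders.

```python
def merge_on_code(waste_rows, gapminder_rows, key="code"):
    """Inner-join two lists of dicts on a shared country-code key.

    Returns the merged rows plus the codes that found no match, so that
    countries dropped by the merge can be inspected rather than lost silently.
    """
    lookup = {row[key]: row for row in gapminder_rows}
    merged, unmatched = [], []
    for row in waste_rows:
        match = lookup.get(row[key])
        if match is None:
            unmatched.append(row[key])
        else:
            # Fields from the waste row win if both sides share a name.
            merged.append({**match, **row})
    return merged, unmatched


# Hypothetical placeholder rows, not values from the actual data sets.
waste = [
    {"code": "AAA", "mismanaged_pc": 0.05},
    {"code": "BBB", "mismanaged_pc": 0.01},
    {"code": "CCC", "mismanaged_pc": 0.02},
]
gapminder = [
    {"code": "AAA", "continent": "Asia"},
    {"code": "BBB", "continent": "Europe"},
]

merged, unmatched = merge_on_code(waste, gapminder)
print(len(merged), unmatched)
```

Returning the unmatched codes alongside the merged rows makes it easy to see which countries fall out of the analysis at this step.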
## Objective and Goals:
Our group's objective for this project is to determine the main predictors of mismanaged waste. We hypothesize that GDP per capita is the strongest predictor of mismanaged waste per country. We also hope to find a second good predictor alongside GDP that helps explain high levels of waste mismanagement.
The plot above shows the relationship between GDP per capita and per capita mismanaged waste. The plot slopes downward, although the relationship does not appear perfectly linear. The plot also shows the density of each variable along its axis. Both variables were log-transformed because both are right-skewed.
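The effect of a log transform on a skewed variable can be checked numerically. The sketch below computes a simple moment-based skewness before and after taking logs. The sample values are hypothetical and were chosen to have a long upper tail, which is an assumption about the shape of GDP-like data rather than something taken from the data set.

```python
import math


def skewness(values):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical per capita GDP values with a long upper tail.
gdp = [500, 800, 1200, 2000, 3500, 6000, 15000, 50000]
log_gdp = [math.log(v) for v in gdp]

print(round(skewness(gdp), 2), round(skewness(log_gdp), 2))
```

A skewness closer to zero after the transform indicates a more symmetric distribution, which is usually easier to model.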
This faceted plot shows the same variables split by continent. The relationship between GDP and mismanaged waste clearly differs by continent: it is slightly positive in Africa, whereas it is negative in every other continent. The correlations within each continent also appear to be very high, which suggests that continent is a strong predictor.
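A per-continent comparison like this one can also be summarized with one correlation coefficient per group. The following is a minimal sketch in plain Python. The key names and the sample points are hypothetical and only illustrate how the sign of the relationship can differ between groups.

```python
import math
from collections import defaultdict


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_by_group(rows, group_key, x_key, y_key):
    """Group rows by group_key and return {group: correlation of x and y}."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append((row[x_key], row[y_key]))
    return {
        group: pearson([p[0] for p in points], [p[1] for p in points])
        for group, points in groups.items()
    }


# Hypothetical (log GDP, log mismanaged waste) points, not real data.
rows = [
    {"continent": "Africa", "log_gdp": 6.0, "log_waste": -3.0},
    {"continent": "Africa", "log_gdp": 7.0, "log_waste": -2.6},
    {"continent": "Africa", "log_gdp": 8.0, "log_waste": -2.5},
    {"continent": "Europe", "log_gdp": 9.0, "log_waste": -3.0},
    {"continent": "Europe", "log_gdp": 10.0, "log_waste": -4.1},
    {"continent": "Europe", "log_gdp": 11.0, "log_waste": -5.0},
]

result = correlation_by_group(rows, "continent", "log_gdp", "log_waste")
print({group: round(r, 2) for group, r in result.items()})
```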
For this plot, we created a coastal population percentage variable by dividing the coastal population by the total population. Some of the resulting proportions are above one. We are unsure why, but the source data does list a coastal population larger than the total population for these countries, and this is something to investigate in the future. The plot does show, however, that the coastal population percentage has some effect, and that this effect varies by continent.
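The derived variable, and a check for proportions above one, can be sketched as follows. This is an illustration in plain Python with hypothetical field names and made-up population figures, not the project's own code.

```python
def coastal_share(coastal_pop, total_pop):
    """Coastal population divided by total population, as a proportion."""
    if total_pop <= 0:
        raise ValueError("total population must be positive")
    return coastal_pop / total_pop


def flag_impossible_shares(rows):
    """Return the names of countries whose coastal share exceeds 1."""
    return [
        row["country"]
        for row in rows
        if coastal_share(row["coastal_pop"], row["total_pop"]) > 1
    ]


# Hypothetical rows, not values from the actual data set.
rows = [
    {"country": "Country A", "coastal_pop": 400_000, "total_pop": 1_000_000},
    {"country": "Country B", "coastal_pop": 1_200_000, "total_pop": 1_000_000},
]

print(flag_impossible_shares(rows))
```

Listing the flagged countries by name is a first step toward finding out whether the two population columns come from different years or use different definitions.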
# Another Comparison of Plastic Waste to Mismanaged Waste
The first plot above combines GDP and continent as predictors of mismanaged waste. This time population was added, but no clear relationship with this variable can be seen. The second, faceted plot shows the 2D densities of our variables by continent. The biggest takeaway for us is the bimodal density in Asia, which most probably reflects the difference between Southeast Asia and the rest of Asia. Oceania also has some outlying points, which is likely due to the small number of observations available for that continent.
Map of countries showing log GDP
In the maps above, we first wanted to show how GDP is distributed throughout the world. The second set of side-by-side maps shows how plastic waste is distributed and how that picture changes when we map mismanaged waste instead. Higher-income countries produce more plastic waste but appear to manage it well in comparison to lower-income countries. Waste mismanagement is also high in Asia, which we attribute in part to disposal practices like those in the US, where plastic recycling was simply sent to China.
In the plot above, we looked for any relationship between total population and per capita mismanaged waste. Such a relationship would show that population still matters after it has already been accounted for in the per capita measure. The plot shows no such relationship.
With these models we wanted to show the effect of continent on each variable: GDP, mismanaged waste, and per capita mismanaged waste. The distributions show the stark differences that exist between continents, which supports continent as a good predictor.
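A generalized additive model (GAM) extends the generalized linear model by letting the response depend on smooth functions of the predictors rather than only on linear terms. Our GAM is weighted by total population.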
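To see what population weighting does to a fit, the sketch below uses ordinary weighted least squares on a straight line. This is a deliberately simpler stand-in for the GAM, not the model used in the project, and the data points and weights are hypothetical.

```python
def weighted_line_fit(xs, ys, weights):
    """Fit y = a + b * x by minimizing sum(w * (y - a - b * x) ** 2).

    Returns (intercept, slope).
    """
    w_sum = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, ys)) / w_sum
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical log GDP and log mismanaged waste for four countries.
log_gdp = [7.0, 8.0, 9.0, 10.0]
log_waste = [-2.0, -2.5, -3.5, -3.0]
equal_weights = [1, 1, 1, 1]
population_weights = [1, 1, 1, 50]  # the last country is far more populous

a_eq, b_eq = weighted_line_fit(log_gdp, log_waste, equal_weights)
a_pop, b_pop = weighted_line_fit(log_gdp, log_waste, population_weights)

# Residual at the most populous country under each fit.
resid_eq = log_waste[-1] - (a_eq + b_eq * log_gdp[-1])
resid_pop = log_waste[-1] - (a_pop + b_pop * log_gdp[-1])
print(round(resid_eq, 3), round(resid_pop, 3))
```

With population weights the fitted line is pulled toward the most populous country, so the fit describes the typical person more than the typical country.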
For our final model, we found that a good fit included GDP and its interaction with continent, plus the coastal population percentage as an additive term. We weighted the model by total population. GDP was our main focus from the beginning, on the understanding that a high GDP means a highly industrialized country and therefore high waste. Because of the differences between continents that we saw in our exploration, continent was added as an interaction term to better explain the mismanagement of waste. Though it has its limits, we found the coastal population percentage to be a meaningful predictor that should be included in the model.
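One way to write down a model of this shape is given below. It is a sketch under assumptions that the text does not confirm: that the response $y_i$ is per capita mismanaged waste on the log scale, that GDP enters on the log scale through a separate smooth function $f_c$ for each continent $c$, and that the coastal population share $s_i$ enters through a single smooth function $g$. All symbols are introduced here for illustration only.

$$
\log(y_i) = \alpha_{c(i)} + f_{c(i)}\big(\log \mathrm{GDP}_i\big) + g(s_i) + \varepsilon_i
$$

Here $c(i)$ is the continent of country $i$ and $\alpha_{c(i)}$ is a continent-specific intercept. Weighting by total population then means that each country's contribution to the fitting criterion is multiplied by its population.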
We did have some shortcomings in terms of data, specifically that we were missing some countries and only had a few variables. It would be interesting to find data on the distance of coastline to replace the coastal population percentage. We also think finding data on the number of industries or including the Gini index as a predictor would add a lot to the model.